305 research outputs found

    BitSearch, the blog before the thesis


    UPC at MediaEval 2013 social event detection task

    These working notes present the contribution of the UPC team to the Social Event Detection (SED) task in MediaEval 2013. The proposal extends the previous PhotoTOC work in the domain of shared collections of photographs stored in cloud services. An initial over-segmentation of the photo collection is later refined by merging pairs of similar clusters.
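    As a rough illustration of the over-segment-then-merge strategy described above, the sketch below over-segments a photo collection by time gaps and then greedily merges adjacent clusters whose mean descriptors are similar. The thresholds, the time-gap heuristic and the feature choice are illustrative assumptions, not the actual configuration of the UPC MediaEval 2013 system.

```python
# Minimal sketch of the over-segment-then-merge idea; all constants are assumptions.
import numpy as np

def oversegment_by_time(timestamps, gap_seconds=300):
    """Start a new cluster whenever the gap between consecutive photos
    exceeds `gap_seconds` (deliberately small -> over-segmentation)."""
    order = np.argsort(timestamps)
    clusters, current = [], [order[0]]
    for prev, nxt in zip(order[:-1], order[1:]):
        if timestamps[nxt] - timestamps[prev] > gap_seconds:
            clusters.append(current)
            current = []
        current.append(nxt)
    clusters.append(current)
    return clusters

def merge_similar_clusters(clusters, features, sim_threshold=0.8):
    """Greedily merge adjacent clusters whose mean feature vectors
    (e.g. visual descriptors) are cosine-similar enough."""
    def centroid(idx):
        v = features[idx].mean(axis=0)
        return v / (np.linalg.norm(v) + 1e-8)

    merged = [clusters[0]]
    for c in clusters[1:]:
        if float(centroid(merged[-1]) @ centroid(c)) > sim_threshold:
            merged[-1] = merged[-1] + c   # fold into the previous cluster
        else:
            merged.append(c)
    return merged

# Toy usage with random data standing in for real photo metadata/features.
rng = np.random.default_rng(0)
timestamps = np.sort(rng.uniform(0, 86400, size=50))
features = rng.normal(size=(50, 128))
events = merge_similar_clusters(oversegment_by_time(timestamps), features)
print(len(events), "event clusters")
```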

    Hyperparameter-free losses for model-based monocular reconstruction

    This work proposes novel hyperparameter-free losses for single-view 3D reconstruction with morphable models (3DMM). We dispense with the hyperparameters used in other works by exploiting geometry, so that the shape of the object and the camera pose are jointly optimized in a single-term expression. This simplification reduces the optimization time and its complexity. Moreover, we propose a novel implicit regularization technique based on random virtual projections that does not require additional 2D or 3D annotations. Our experiments suggest that minimizing a shape reprojection error together with the proposed implicit regularization is especially suitable for applications that require precise alignment between geometry and image spaces, such as augmented reality. We evaluate our losses on a large-scale dataset with 3D ground truth and publish our implementations to facilitate reproducibility and public benchmarking in this field.
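    The sketch below shows one plausible reading of the losses described above: a single reprojection term that couples the 3DMM shape coefficients with the camera pose, plus a guessed form of the random-virtual-projection regularizer. Every function, projection model and constant here is an assumption rather than the paper's exact formulation.

```python
# Illustrative numpy sketch; not the paper's implementation.
import numpy as np

def project(points3d, rotation, translation, focal=1.0):
    """Simple perspective projection of Nx3 points to Nx2 image coordinates."""
    cam = points3d @ rotation.T + translation          # rigid transform
    return focal * cam[:, :2] / (cam[:, 2:3] + 1e-8)   # perspective divide

def reprojection_loss(mean_shape, basis, coeffs, rotation, translation,
                      landmarks2d, landmark_idx):
    """Single term coupling shape coefficients and pose: reproject the
    morphable-model landmarks and compare against detected 2D landmarks."""
    shape = (mean_shape + basis @ coeffs).reshape(-1, 3)
    proj = project(shape[landmark_idx], rotation, translation)
    return np.mean(np.sum((proj - landmarks2d) ** 2, axis=1))

def virtual_projection_reg(mean_shape, basis, coeffs, n_views=8, seed=0):
    """Guessed implicit regularizer: under random virtual cameras, the morphed
    shape should not project too far from the mean shape (no 2D/3D labels)."""
    rng = np.random.default_rng(seed)
    shape = (mean_shape + basis @ coeffs).reshape(-1, 3)
    mean = mean_shape.reshape(-1, 3)
    total = 0.0
    for _ in range(n_views):
        q, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # random orthogonal matrix
        t = np.array([0.0, 0.0, 5.0])                   # push in front of camera
        total += np.mean((project(shape, q, t) - project(mean, q, t)) ** 2)
    return total / n_views

# Toy usage with a random 5-vertex model (purely illustrative).
rng = np.random.default_rng(1)
mean_shape = rng.normal(size=15)
basis = 0.1 * rng.normal(size=(15, 4))
coeffs = rng.normal(size=4)
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
print(reprojection_loss(mean_shape, basis, coeffs, R, t,
                        rng.normal(size=(5, 2)), np.arange(5)),
      virtual_projection_reg(mean_shape, basis, coeffs))
```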

    Temporal activity detection in untrimmed videos with recurrent neural networks

    This work proposes a simple pipeline to classify and temporally localize activities in untrimmed videos. Our system uses features from a 3D Convolutional Neural Network (C3D) as input to train a recurrent neural network (RNN) that learns to classify video clips of 16 frames. After clip prediction, we post-process the output of the RNN to assign a single activity label to each video, and determine the temporal boundaries of the activity within the video. We show how our system can achieve competitive results in both tasks with a simple architecture. We evaluate our method in the ActivityNet Challenge 2016, achieving a 0.5874 mAP and a 0.2237 mAP in the classification and detection tasks, respectively. Our code and models are publicly available at: https://imatge-upc.github.io/activitynet-2016-cvprw/
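    A rough PyTorch sketch of the clip-level pipeline described in the abstract: pre-extracted C3D features for 16-frame clips feed an RNN that emits a class per clip, and a simple post-processing step derives one video label plus temporal boundaries. The dimensions, number of classes and the post-processing heuristic are assumptions, not the exact configuration used for ActivityNet 2016.

```python
import torch
import torch.nn as nn

class ClipRNN(nn.Module):
    def __init__(self, feat_dim=4096, hidden=512, n_classes=201):  # 200 + background (assumed)
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.cls = nn.Linear(hidden, n_classes)

    def forward(self, clip_feats):          # (batch, n_clips, feat_dim)
        out, _ = self.rnn(clip_feats)
        return self.cls(out)                # per-clip class scores

def postprocess(clip_scores, background=0, clip_len=16, fps=25.0):
    """Assign one activity label per video and report its temporal extent as
    the contiguous clips predicted with that label (a simplifying assumption)."""
    pred = clip_scores.argmax(dim=-1)                    # (n_clips,)
    fg = pred[pred != background]
    if fg.numel() == 0:
        return background, []
    label = int(torch.mode(fg).values)
    segments, start = [], None
    for i, p in enumerate(pred.tolist() + [background]):  # sentinel closes last run
        if p == label and start is None:
            start = i
        elif p != label and start is not None:
            segments.append((start * clip_len / fps, i * clip_len / fps))
            start = None
    return label, segments

model = ClipRNN()
scores = model(torch.randn(1, 20, 4096))      # one video, 20 clips of C3D features
print(postprocess(scores[0]))
```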

    Skin lesion classification from dermoscopic images using deep learning techniques

    The recent emergence of deep learning methods for medical image analysis has enabled the development of intelligent medical imaging-based diagnosis systems that can assist the human expert in making better decisions about a patient’s health. In this paper we focus on the problem of skin lesion classification, particularly early melanoma detection, and present a deep-learning based approach to solve the problem of classifying a dermoscopic image containing a skin lesion as malignant or benign. The proposed solution is built around the VGGNet convolutional neural network architecture and uses the transfer learning paradigm. Experimental results are encouraging: on the ISIC Archive dataset, the proposed method achieves a sensitivity value of 78.66%, which is significantly higher than the current state of the art on that dataset.
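    A hedged sketch of the transfer-learning setup the abstract describes: a VGG backbone pretrained on ImageNet with its last layer replaced for the binary malignant/benign decision. Which layers are frozen or fine-tuned, and the training details, are assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_lesion_classifier(freeze_backbone=True):
    net = models.vgg16(weights="IMAGENET1K_V1")   # older torchvision: pretrained=True
    if freeze_backbone:
        for p in net.features.parameters():       # keep convolutional filters fixed
            p.requires_grad = False
    net.classifier[6] = nn.Linear(4096, 1)        # single malignancy logit
    return net

model = build_lesion_classifier()
criterion = nn.BCEWithLogitsLoss()                # benign = 0, malignant = 1
dummy = torch.randn(2, 3, 224, 224)               # dermoscopic images after resizing
logits = model(dummy).squeeze(1)
loss = criterion(logits, torch.tensor([0.0, 1.0]))
print(loss.item())
```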

    System architecture for indexing regions in keyframes

    This paper describes the design of an indexing system for a video database. The system uses region-based manual annotations of keyframes to create models to automatically annotate new keyframes, also at the region level. The presented architecture includes user interfaces for training and querying the system, internal databases to manage ingested content and modelled semantic classes, as well as communication interfaces to allow the system interconnection. The scheme is designed to work as a plug-in to an external Multimedia Asset Management (MAM) system.
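    As a toy stand-in for the region-level train/annotate loop that this architecture supports, the sketch below trains per-class models from manually annotated region descriptors and then labels the regions of a new keyframe. The real system's descriptors, models, databases and MAM plug-in interfaces are not reproduced here; everything below is an illustrative assumption.

```python
import numpy as np

class RegionIndex:
    def __init__(self):
        self.centroids = {}                     # semantic class -> mean descriptor

    def train(self, annotations):
        """annotations: list of (class_name, region_descriptor) pairs
        coming from the manual annotation interface."""
        by_class = {}
        for name, desc in annotations:
            by_class.setdefault(name, []).append(desc)
        self.centroids = {n: np.mean(d, axis=0) for n, d in by_class.items()}

    def annotate(self, region_descriptors, max_dist=1.0):
        """Assign each region of a new keyframe to the closest class model,
        or None when nothing is close enough."""
        labels = []
        for desc in region_descriptors:
            name, dist = min(((n, np.linalg.norm(desc - c))
                              for n, c in self.centroids.items()),
                             key=lambda x: x[1])
            labels.append(name if dist <= max_dist else None)
        return labels

index = RegionIndex()
index.train([("sky", np.array([0.9, 0.1])), ("grass", np.array([0.1, 0.9]))])
print(index.annotate([np.array([0.8, 0.2]), np.array([0.2, 0.8])]))
```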

    Assessing knee OA severity with CNN attention-based end-to-end architectures

    This work proposes a novel end-to-end convolutional neural network (CNN) architecture to automatically quantify the severity of knee osteoarthritis (OA) using X-Ray images, which incorporates trainable attention modules acting as unsupervised fine-grained detectors of the region of interest (ROI). The proposed attention modules can be applied at different levels and scales across any CNN pipeline, helping the network to learn relevant attention patterns over the most informative parts of the image at different resolutions. We test the proposed attention mechanism on existing state-of-the-art CNN architectures as our base models, achieving promising results on the benchmark knee OA datasets from the osteoarthritis initiative (OAI) and multicenter osteoarthritis study (MOST).
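    A minimal PyTorch sketch of the kind of trainable spatial-attention gate the abstract describes: a small module that can be dropped between CNN stages at different resolutions and learns, without ROI supervision, which image regions to emphasize. The exact module design in the paper may differ; the layer sizes below are assumptions.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, kernel_size=1),  # one attention map
            nn.Sigmoid(),
        )

    def forward(self, feats):                  # (B, C, H, W)
        attn = self.score(feats)               # (B, 1, H, W) in [0, 1]
        return feats * attn, attn              # gated features + map for inspection

# Illustrative use between two convolutional stages of a backbone.
stage1 = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
attend = SpatialAttention(32)
stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())

xray = torch.randn(2, 1, 224, 224)            # grayscale knee X-ray batch
gated, attn_map = attend(stage1(xray))
print(stage2(gated).shape, attn_map.shape)
```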

    An interactive lifelog search engine for LSC2018

    In this work, we describe an interactive lifelog search engine developed for the LSC 2018 search challenge at ACM ICMR 2018. The paper introduces the four-step process required to support lifelog search engines and describes the source data for the search engine as well as the approach to ranking chosen for the iterative search engine. Finally, the interface is introduced before we highlight the limits of the current prototype and suggest opportunities for future work.

    Hierarchical object detection with deep reinforcement learning

    We present a method for performing hierarchical object detection in images guided by a deep reinforcement learning agent. The key idea is to focus on those parts of the image that contain richer information and zoom in on them. We train an intelligent agent that, given an image window, is capable of deciding where to focus its attention among five different predefined region candidates (smaller windows). This procedure is iterated, providing a hierarchical image analysis. We compare two different candidate proposal strategies to guide the object search: with and without overlap. Moreover, our work compares two different strategies to extract features from a convolutional neural network for each region proposal: a first one that computes new feature maps for each region proposal, and a second one that computes the feature maps for the whole image and later generates crops for each region proposal. Experiments indicate better results for the overlapping candidate proposal strategy and a loss of performance for the cropped image features due to the loss of spatial resolution. We argue that, while this loss seems unavoidable when working with large numbers of object candidates, the much smaller number of region proposals generated by our reinforcement learning agent makes it feasible to extract features for each location without sharing convolutional computation among regions.
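    The sketch below illustrates the hierarchical zooming loop: at each step an agent picks one of five predefined sub-windows of the current window (four corners plus a central one) and the search recurses into it. The sub-window sizes, the stopping rule and the random stand-in "agent" are illustrative assumptions, not the trained deep RL policy from the paper.

```python
import random

def candidate_windows(x0, y0, x1, y1, overlap=True):
    """Five child windows of (x0, y0, x1, y1); with overlap=True the children
    cover a larger fraction of each side and therefore overlap each other."""
    w, h = x1 - x0, y1 - y0
    scale = 0.75 if overlap else 0.5            # assumed sizes for the two strategies
    cw, ch = w * scale, h * scale
    corners = [(x0, y0), (x1 - cw, y0), (x0, y1 - ch), (x1 - cw, y1 - ch)]
    center = (x0 + (w - cw) / 2, y0 + (h - ch) / 2)
    return [(cx, cy, cx + cw, cy + ch) for cx, cy in corners + [center]]

def hierarchical_search(image_size, agent, max_steps=5, overlap=True):
    """Iteratively zoom into the window selected by the agent."""
    window = (0.0, 0.0, float(image_size[0]), float(image_size[1]))
    for _ in range(max_steps):
        children = candidate_windows(*window, overlap=overlap)
        action = agent(window, children)        # index of chosen child, or None to stop
        if action is None:
            break
        window = children[action]
    return window

random_agent = lambda window, children: random.randrange(len(children))
print(hierarchical_search((640, 480), random_agent))
```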